217 research outputs found

    Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016)

    Get PDF
    Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016) Timisoara, Romania. February 8-11, 2016.The PhD Symposium was a very good opportunity for the young researchers to share information and knowledge, to present their current research, and to discuss topics with other students in order to look for synergies and common research topics. The idea was very successful and the assessment made by the PhD Student was very good. It also helped to achieve one of the major goals of the NESUS Action: to establish an open European research network targeting sustainable solutions for ultrascale computing aiming at cross fertilization among HPC, large scale distributed systems, and big data management, training, contributing to glue disparate researchers working across different areas and provide a meeting ground for researchers in these separate areas to exchange ideas, to identify synergies, and to pursue common activities in research topics such as sustainable software solutions (applications and system software stack), data management, energy efficiency, and resilience.European Cooperation in Science and Technology. COS

    A multi-tier cached I/O architecture for massively parallel supercomputers

    Get PDF
    Recent advances in storage technologies and high performance interconnects have made possible in the last years to build, more and more potent storage systems that serve thousands of nodes. The majority of storage systems of clusters and supercomputers from Top 500 list are managed by one of three scalable parallel file systems: GPFS, PVFS, and Lustre. Most large-scale scientific parallel applications are written in Message Passing Interface (MPI), which has become the de-facto standard for scalable distributed memory machines. One part of the MPI standard is related to I/O and has among its main goals the portability and efficiency of file system accesses. All of the above mentioned parallel file systems may be accessed also through the MPI-IO interface. The I/O access patterns of scientific parallel applications often consist of accesses to a large number of small, non-contiguous pieces of data. For small file accesses the performance is dominated by the latency of network transfers and disks. Parallel scientific applications lead to interleaved file access patterns with high interprocess spatial locality at the I/O nodes. Additionally, scientific applications exhibit repetitive behaviour when a loop or a function with loops issues I/O requests. When I/O access patterns are repetitive, caching and prefetching can effectively mask their access latency. These characteristics of the access patterns motivated several researchers to propose parallel I/O optimizations both at library and file system levels. However, these optimizations are not always integrated across different layers in the systems. In this dissertation we propose a novel generic parallel I/O architecture for clusters and supercomputers. Our design is aimed at large-scale parallel architectures with thousands of compute nodes. Besides acting as middleware for existing parallel file systems, our architecture provides on-line virtualization of storage resources. Another objective of this thesis is to factor out the common parallel I/O functionality from clusters and supercomputers in generic modules in order to facilitate porting of scientific applications across these platforms. Our solution is based on a multi-tier cache architecture, collective I/O, and asynchronous data staging strategies hiding the latency of data transfer between cache tiers. The thesis targets to reduce the file access latency perceived by the data-intensive parallel scientific applications by multi-layer asynchronous data transfers. In order to accomplish this objective, our techniques leverage the multi-core architectures by overlapping computation with communication and I/O in parallel threads. Prototypes of our solutions have been deployed on both clusters and Blue Gene supercomputers. Performance evaluation shows that the combination of collective strategies with overlapping of computation, communication, and I/O may bring a substantial performance benefit for access patterns common for parallel scientific applications.-----------------------------------------------------------------------------------------------------------------------------En los últimos años se ha observado un incremento sustancial de la cantidad de datos producidos por las aplicaciones científicas paralelas y de la necesidad de almacenar estos datos de forma persistente. Los sistemas de ficheros paralelos como PVFS, Lustre y GPFS han ofrecido una solución escalable para esta demanda creciente de almacenamiento. La mayoría de las aplicaciones científicas son escritas haciendo uso de la interfaz de paso de mensajes (MPI), que se ha convertido en un estándar de-facto de programación para las arquitecturas de memoria distribuida. Las aplicaciones paralelas que usan MPI pueden acceder a los sistemas de ficheros paralelos a través de la interfaz ofrecida por MPI-IO. Los patrones de acceso de las aplicaciones científicas paralelas consisten en un gran número de accesos pequeños y no contiguos. Para tamaños de acceso pequeños, el rendimiento viene limitado por la latencia de las transferencias de red y disco. Además, las aplicaciones científicas llevan a cabo accesos con una alta localidad espacial entre los distintos procesos en los nodos de E/S. Adicionalmente, las aplicaciones científicas presentan típicamente un comportamiento repetitivo. Cuando los patrones de acceso de E/S son repetitivos, técnicas como escritura demorada y lectura adelantada pueden enmascarar de forma eficiente las latencias de los accesos de E/S. Estas características han motivado a muchos investigadores en proponer optimizaciones de E/S tanto a nivel de biblioteca como a nivel del sistema de ficheros. Sin embargo, actualmente estas optimizaciones no se integran siempre a través de las distintas capas del sistema. El objetivo principal de esta tesis es proponer una nueva arquitectura genérica de E/S paralela para clusters y supercomputadores. Nuestra solución está basada en una arquitectura de caches en varias capas, una técnica de E/S colectiva y estrategias de acceso asíncronas que ocultan la latencia de transferencia de datos entre las distintas capas de caches. Nuestro diseño está dirigido a arquitecturas paralelas escalables con miles de nodos de cómputo. Además de actuar como middleware para los sistemas de ficheros paralelos existentes, nuestra arquitectura debe proporcionar virtualización on-line de los recursos de almacenamiento. Otro de los objeticos marcados para esta tesis es la factorización de las funcionalidades comunes en clusters y supercomputadores, en módulos genéricos que faciliten el despliegue de las aplicaciones científicas a través de estas plataformas. Se han desplegado distintos prototipos de nuestras soluciones tanto en clusters como en supercomputadores. Las evaluaciones de rendimiento demuestran que gracias a la combicación de las estratégias colectivas de E/S y del solapamiento de computación, comunicación y E/S, se puede obtener una sustancial mejora del rendimiento en los patrones de acceso anteriormente descritos, muy comunes en las aplicaciones paralelas de caracter científico

    Work in progress about enhancing the programmability and energy efficiency of storage in HPC and cloud environments

    Get PDF
    Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016) Timisoara, Romania. February 8-11, 2016.We present the work in progress for the PhD thesis titled “Enhancing the programmability and energy efficiency of storage in HPC and cloud environments”. In this thesis, we focus on studying and optimizing data movement across different layers of the operating system’s I/O stack. We study the power consumption during I/O-intensive workloads using sophisticated software and hardware instrumentation, collecting time series data from internal ATX power lines that feed every system component, and several run-time operating system metrics. Data exploration and data analysis reveal for each I/O access pattern various power and performance regimes. These regimes show how power is used by the system as data moved through the I/O stack. We use this knowledge to build I/O power models that are able to predict power consumption for different I/O workloads, and optimize the CPU device driver that manage performance states to obtain great power savings (over 30%). Finally, we develop new mechanisms and abstractions that allow co-located virtual machines to share data with each other more efficiently. Our virtualized data sharing solution reduces data movement among virtual domains, leading to energy savings I/O performance improvements.European Cooperation in Science and Technology. COS

    A generic I/O architecture for data-intensiveapplications based on in-memorydistributed cache

    Get PDF
    Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016) Timisoara, Romania. February 8-11, 2016.The evolution in scientific computing towards data-intensive applications and the increase of heterogeneity in the computing resources, are exposing new challenges in the I/O layer requirements. We propose a generic I/O architecture for data-intensive applications based on in-memory distributed caching. This solution leverages the evolution of network capacities and the price drop in memory to improve I/O performance for I/O-bounded applications adaptable to existing high-performance scenarios. We have showed the potential improvementsEuropean Cooperation in Science and Technology. COSTThis work is partially supported by EU under the COST Program Action IC1305: Network for Sustainable Ultrascale Computing (NESUS). This work is partially supported by the grant TIN2013-41350-P, Scalable Data Management Techniques for High-End Computing Systems from the Spanish Ministry of Economy and Competitiveness

    High-level programming for heterogeneous and hierarchical parallel systems

    Get PDF
    High-Level Heterogeneous and Hierarchical Parallel Systems (HLPGPU) aims to bring together researchers and practitioners to present new results and ongoing work on those aspects of high-level programming relevant, or specific to general-purpose computing on graphics processing units (GPGPUs) and new architectures. The 2016 HLPGPU symposium was an event co-located with the HiPEAC conference in Prague, Czech Republic. HLPGPU is targeted at high-level parallel techniques, including programming models, libraries and languages, algorithmic skeletons, refactoring tools and techniques for parallel patterns, tools and systems to aid parallel programming, heterogeneous computing, timing analysis and statistical performance models.PostprintPeer reviewe

    Proceedings of the Second International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2015) Krakow, Poland

    Get PDF
    Proceedings of: Second International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2015). Krakow (Poland), September 10-11, 2015

    Exposing data locality in HPC-based systems by using the HDFS backend

    Get PDF
    This work was partially supported by the project “CABAHLA-CM: Convergencia Big data-Hpc: de los sensores a las Aplicaciones” S2018/TCS4423 from Madrid Regional Government and the European Union’s Horizon 2020 research, New Data Intensive Computing Methods for High-End and Edge Computing Platforms (DECIDE). Ref. PID2019-107858GB-I00 and innovation program under grant agreement No 801091, project “ÀSPIDE: Exascale programming models for extreme data processing”

    New directions in mobile, hybrid, and heterogeneous clouds for cyberinfrastructures

    Get PDF
    With the increasing availability of mobile devices and data generated by end-users, scientific instruments and simulations solving many of our most important scientific and engineering problems require innovative technical solutions. These solutions should provide the whole chain to process data and services from the mobile users to the cloud infrastructure, which must also integrate heterogeneous clouds to provide availability, scalability, and data privacy. This special issue presents the results of particular research works showing advances on mobile, hybrid, and heterogeneous clouds for modern cyberinfrastructures

    Boosting analyses in the life sciences via clusters, grids and clouds

    Get PDF
    In the last 20 years, computational methods have become an important part of developing emerging technologies for the field of bioinformatics and biomedicine. Those methods rely heavily on large scale computational resources as they need to manage Tbytes or Pbytes of data with large-scale structural and functional relationships, TFlops or PFlops of computing power for simulating highly complex models, or many-task processes and workflows for processing and analyzing data. This special issue contains papers showing existing solutions and latest developments in Life Sciences and Computing Sciences to collaboratively explore new ideas and approaches to successfully apply distributed IT-systems in translational research, clinical intervention, and decision-making. (C) 2016 Published by Elsevier B.V
    • …
    corecore